35 research outputs found

QuesNet: A Unified Representation for Heterogeneous Test Questions

Author: Boopathiraj C
Devlin Jacob
Douglas David E
Duan Huizhong
Glorot Xavier
Kingma Diederik P
Ngiam Jiquan
Zhang Liang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/05/2019

Understanding learning materials (e.g. test questions) is a crucial issue in online learning systems, which can promote many applications in education domain. Unfortunately, many supervised approaches suffer from the problem of scarce human labeled data, whereas abundant unlabeled resources are highly underutilized. To alleviate this problem, an effective solution is to use pre-trained representations for question understanding. However, existing pre-training methods in NLP area are infeasible to learn test question representations due to several domain-specific characteristics in education. First, questions usually comprise heterogeneous data including content text, images and side information. Second, there exists both basic linguistic information as well as domain logic and knowledge. To this end, in this paper, we propose a novel pre-training method, namely QuesNet, for comprehensively learning question representations. Specifically, we first design a unified framework to aggregate question information with its heterogeneous inputs into a comprehensive vector. Then we propose a two-level hierarchical pre-training algorithm to learn better understanding of test questions in an unsupervised way. Here, a novel holed language model objective is developed to extract low-level linguistic features, and a domain-oriented objective is proposed to learn high-level logic and knowledge. Moreover, we show that QuesNet has good capability of being fine-tuned in many question-based tasks. We conduct extensive experiments on large-scale real-world question data, where the experimental results clearly demonstrate the effectiveness of QuesNet for question understanding as well as its superior applicability.

arXiv.org e-Print Archive

Crossref

Applying Deep Learning To Airbnb Search

Author: Abdool Mustafa
Barrow-Williams Nick
Collins Brendan M.
Duan Huizhong
Haldar Malay
Legrand Thomas
Ramanathan Prashant
Turnbull Bradley C.
Xu Tao
Yang Shulin
Zhang Qing
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/10/2018

The application to search ranking is one of the biggest machine learning success stories at Airbnb. Much of the initial gains were driven by a gradient boosted decision tree model. The gains, however, plateaued over time. This paper discusses the work done in applying neural networks in an attempt to break out of that plateau. We present our perspective not with the intention of pushing the frontier of new modeling techniques. Instead, ours is a story of the elements we found useful in applying neural networks to a real life product. Deep learning was steep learning for us. To other teams embarking on similar journeys, we hope an account of our struggles and triumphs will provide some useful pointers. Bon voyage! Comment: 8 page

arXiv.org e-Print Archive

Crossref

Learning to Ask: Question-based Sequential Bayesian Product Search

Author: Ai Qingyao
Duan Huizhong
Gordon
Gupta TE
Gysel Christophe Van
Sun Yueming
Wen Zheng
Zhang Yongfeng
Publication date: 01/01/2019

Product search is generally recognized as the first and foremost stage of online shopping and thus significant for users and retailers of e-commerce. Most of the traditional retrieval methods use some similarity functions to match the user's query and the document that describes a product, either directly or in a latent vector space. However, user queries are often too general to capture the minute details of the specific product that a user is looking for. In this paper, we propose a novel interactive method to effectively locate the best matching product. The method is based on the assumption that there is a set of candidate questions for each product to be asked. In this work, we instantiate this candidate set by making the hypothesis that products can be discriminated by the entities that appear in the documents associated with them. We propose a Question-based Sequential Bayesian Product Search method, QSBPS, which directly queries users on the expected presence of entities in the relevant product documents. The method learns the product relevance as well as the reward of the potential questions to be asked to the user by being trained on the search history and purchase behavior of a specific user together with that of other users. The experimental results show that the proposed method can greatly improve the performance of product search compared to the state-of-the-art baselines. Comment: This paper is accepted by CIKM 201
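
The question-and-update loop described in this abstract can be illustrated with a small sketch. This is not the QSBPS algorithm itself: the product documents in `DOCS`, the fixed answer-noise rate, and the "split the belief mass evenly" question heuristic are all assumptions made for illustration, whereas the paper learns product relevance and question rewards from search and purchase history.

```python
# Hypothetical product documents: product -> entities appearing in its text.
DOCS = {
    "tent": {"camping", "waterproof"},
    "boots": {"hiking", "waterproof"},
    "stove": {"camping", "gas"},
}

def update_belief(belief, docs, entity, answer, noise=0.1):
    """Bayes rule: reweight each product by how well it explains the answer.

    `noise` is an assumed probability that the user answers incorrectly.
    """
    posterior = {}
    for product, prior in belief.items():
        present = entity in docs[product]
        likelihood = 1.0 - noise if present == answer else noise
        posterior[product] = prior * likelihood
    total = sum(posterior.values())
    return {p: w / total for p, w in posterior.items()}

def pick_question(belief, docs, asked):
    """Assumed heuristic: ask about the entity that splits belief closest to 50/50."""
    entities = sorted(set().union(*docs.values()) - asked)

    def imbalance(entity):
        mass = sum(w for p, w in belief.items() if entity in docs[p])
        return abs(mass - 0.5)

    return min(entities, key=imbalance)

# Start from a uniform belief and record a "yes" to "is it about camping?".
belief = {product: 1.0 / len(DOCS) for product in DOCS}
belief = update_belief(belief, DOCS, "camping", answer=True)
next_entity = pick_question(belief, DOCS, asked={"camping"})
```

After the "yes" answer, the belief mass shifts toward the two products whose documents mention the entity, and the next question is chosen to separate them.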

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Substantial transition to clean household energy mix in rural China

Author: Chen Yilin
Chen Yuanchen
Cheng Hefa
Du Wei
Duan Yonghong
Duo Jia
Fan Fenggui
Huang Lei
Jiangtulu Bahabaike
Ju Tianzhen
Li Shunxin
Li Yungui
Liu Fenggui
Liu Xianli
Luo Zhihan
Meng Jing
Meng Wenjun
Nan Ying
Pan Bo
Pan Yanfang
Shen Guofeng
Shen Huizhong
Tao Shu
Tian Yanlin
Wang Bin
Wang Lizhi
Wang Mu
Xiong Rui
Xue Bing
Zeng Eddy
Zhan Chao
Publication venue: 'Oxford University Press (OUP)'
Publication date: 14/03/2022

The household energy mix has significant impacts on human health and climate, as it contributes greatly to many health- and climate-relevant air pollutants. Compared to the well-established urban energy statistical system, the rural household energy statistical system is incomplete and is often associated with high biases. Via a nationwide investigation, this study revealed high contributions to energy supply from coal and biomass fuels in the rural household energy sector, while electricity comprised ∼20%. Stacking (the use of multiple sources of energy) is significant, and the average number of energy types was 2.8 per household. Compared to 2012, the consumption of biomass and coals in 2017 decreased by 45% and 12%, respectively, while the gas consumption amount increased by 204%. Increased gas and decreased coal consumptions were mainly in cooking, while decreased biomass was in both cooking (41%) and heating (59%). The time-sharing fraction of electricity and gases (E&G) for daily cooking grew, reaching 69% in 2017, but for space heating, traditional solid fuels were still dominant, with the national average shared fraction of E&G being only 20%. The non-uniform spatial distribution and the non-linear increase in the fraction of E&G indicated challenges to achieving universal access to modern cooking energy by 2030, particularly in less-developed rural and mountainous areas. In some non-typical heating zones, the increased share of E&G for heating was significant and largely driven by income growth, but in typical heating zones, the time-sharing fraction was <5% and was not significantly increased, except in areas with policy intervention. The intervention policy not only led to dramatic increases in the clean energy fraction for heating but also accelerated the clean cooking transition. Higher income, higher education, younger age, less energy/stove stacking and smaller family size positively impacted the clean energy transition.

UCL Discovery

PubMed Central

Intent modeling and automatic query reformulation for search engine systems

Author: Duan Huizhong
Publication date: 01/12/2013

Understanding and modeling users' intent in search queries is an important topic in studying search engine systems. Good understanding of search intent is required in order to achieve better search accuracy and better user experience. In this thesis work, I identify and study three major problems in the subject: ambiguous search intent, ineffective query formulation and vague relevance criteria. To systematically study these problems, the thesis consists of three parts. In the first part, I study search intent ambiguity in search engine queries and propose a click pattern-based method that captures ambiguous search intent based on behavioral difference rather than semantic difference. Analysis shows that the proposed method is more accurate and robust in measuring query ambiguity. In the second part, I study how to provide query formulation support to facilitate users in expressing search intent. Query completion and correction, and syntactic query reformulation are proposed and studied in this part. Experiments show that the proposed query formulation support methods can help users formulate more effective queries and alleviate search difficulty. In the third part, I study how to model search intent so that we can gain insights about users' behaviors and leverage the knowledge to improve search engines. Two topics are studied in this part: modeling search intent with data level representation and discovering coordinated shopping intent in product search. It is shown that the proposed methods can not only discover meaningful user intent but also improve search and other related applications. The proposed models and algorithms in the thesis are general and can be applied to improve search accuracy in potentially many different search engines. As a systematic study on intent modeling and automatic query reformulation in search engine systems, this thesis work also provides a road map to future exploration on intent understanding and analysis.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Online Spelling Correction for Query Completion

Author: Bo-June (Paul) Hsu
Huizhong Duan
Publication date: 01/01/2011

In this paper, we study the problem of online spelling correction for query completions. Misspelling is a common phenomenon among search engine queries. In order to help users effectively express their information needs, mechanisms for automatically correcting misspelled queries are required. Online spelling correction aims to provide spell-corrected completion suggestions as a query is incrementally entered. As latency is crucial to the utility of the suggestions, such an algorithm needs to be not only accurate, but also efficient. To tackle this problem, we propose and study a generative model for input queries, based on a noisy channel transformation of the intended queries. Utilizing spelling correction pairs, we train a Markov n-gram transformation model that captures user spelling behavior in an unsupervised fashion. To find the top spell-corrected completion suggestions in real time, we adapt the A* search algorithm with various pruning heuristics to dynamically expand the search space efficiently. Evaluation of the proposed methods demonstrates a substantial increase in the effectiveness of online spelling correction over existing techniques.
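
The noisy channel idea in this abstract can be sketched as ranking each candidate query by its prior probability times the probability that the user would type the observed prefix when intending that query. The sketch below is a simplification under stated assumptions: the query counts in `QUERY_COUNTS` are made up, the channel model is a fixed per-edit penalty over Levenshtein distance rather than the paper's Markov n-gram transformation model, and candidates are scored by brute force rather than by A* search with pruning.

```python
import math

# Hypothetical query log: intended query -> frequency, used as the prior.
QUERY_COUNTS = {"britney spears": 50, "british airways": 30, "bridge games": 20}

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def channel_log_prob(typed: str, intended_prefix: str, error_cost: float = 3.0) -> float:
    """Toy channel model (an assumption): each edit costs a fixed log penalty."""
    return -error_cost * edit_distance(typed, intended_prefix)

def suggest(typed: str, k: int = 2) -> list:
    """Return the k candidate queries with the highest prior-plus-channel score."""
    total = sum(QUERY_COUNTS.values())
    scored = []
    for query, count in QUERY_COUNTS.items():
        # The typed text may correspond to any prefix of the intended query.
        best_channel = max(channel_log_prob(typed, query[:n])
                           for n in range(len(query) + 1))
        scored.append((math.log(count / total) + best_channel, query))
    scored.sort(reverse=True)
    return [query for _, query in scored[:k]]

top = suggest("brittney sp")
```

With this toy data, the misspelled prefix "brittney sp" is one edit away from a prefix of "britney spears", so that query ranks first despite the typo.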

CiteSeerX

Crossref